Abstract

Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. 
Motivated by several recent studies of local graph algorithms, we consider the following variant 
of this problem. Let G be a connected bounded-degree graph. Given an edge e in G we would 
like to decide whether e belongs to a connected subgraph G' consisting of $(1 + \epsilon)n$ edges (for a
prespecified constant $\epsilon > 0$), where the decision for different edges should be consistent with the
same subgraph G'. Can this task be performed by inspecting only a constant number of edges 
in G? Our main results are: 

• We show that if every $t$-vertex subgraph of G has expansion $1/(\log t)^{1+o(1)}$ then one can
(deterministically) construct a sparse spanning subgraph G' of G using few inspections. To
this end we analyze a “local” version of a famous minimum-weight spanning tree algorithm.

• We show that the above expansion requirement is sharp even when allowing randomization.
To this end we construct a family of 3-regular graphs of high girth, in which every $t$-vertex
subgraph has expansion $1/(\log t)^{1-o(1)}$.


1 Introduction 

Given a graph G, one of the most basic tasks one would like to perform on G is to find a spanning 
tree of it or perhaps some other sparse spanning subgraph G'. This task can be easily accomplished 
using numerous well-known algorithms such as DFS (depth-first search), BFS (breadth-first search) 
and more. What all of these algorithms have in common is that in order to decide whether a given 
edge e belongs to the spanning subgraph G', one has to construct the entire spanning tree. Suppose
however that one is not interested in constructing the entire spanning subgraph G', but rather in
being able to “quickly” tell if a given edge e belongs to G' or not. By “quickly” we mean using a
constant number of operations. 

Such algorithms are of importance in distributed settings, where processors reside on the vertices
of the graph and the goal is to select as few communication links (edges) as possible while
maintaining connectivity. Another relevant setting is one in which the graph resides in a centralized 
database, but different, uncoordinated, servers have access to it, and are interested in only parts of 
a common sparse spanning subgraph. In both cases we would like the decision regarding any given 
edge to be made after inspecting only a very small portion of the whole graph, but all decisions 
must be consistent with the same spanning subgraph. Such algorithms belong to a growing family 
of local algorithms for solving classical problems in graph theory. We elaborate on relevant related 
works in Subsection 1.1. 

Let us make a simple observation regarding the task of locally constructing a spanning subgraph. 
Note that if one insists on locally constructing a spanning tree G' , then it is easy to see that the 
task cannot be performed in general without inspecting almost all of G; that is, this task cannot 
be achieved using a constant number of queries to G. To see this, observe that if G consists of a 
single path, then the algorithm must answer positively on all edges, while if G consists of a single 
cycle then the algorithm must answer negatively on one edge. However, the two cases cannot be 
distinguished without inspecting a linear number of edges. 

So suppose we allow the algorithm some slackness, and rather than requiring that G' be a tree, 
only require that it be relatively sparse, i.e., contain at most $(1 + \epsilon)n$ edges. Summarizing, the
question is then, given $\epsilon > 0$, for which graphs G can we locally construct a spanning subgraph G'
consisting of $(1 + \epsilon)n$ edges, such that given an edge $e \in E(G)$ one can determine if $e \in G'$ using a
constant (that may depend on $\epsilon$ but not on $n$) number of queries to G?

Our main result in this paper, stated informally as Theorem 1 below, shows that the answer to 
the above question is given by a certain variant of graph expansion, which we now turn to define. 
For a graph G and a subset $S \subseteq V(G)$, we write $\partial_G(S)$ for the set of edges of G with precisely one
endpoint in S. We write $\phi_G$ for the (edge) expansion of G, that is, $\phi_G = \min_S |\partial_G(S)| / |S|$ where
the minimum is taken over all $S \subseteq V(G)$ of size $1 \le |S| \le |V(G)|/2$. Note that a graph may have
small expansion yet contain (large) subgraphs with large expansion. For example, a vertex-disjoint
union of cliques has expansion 0, yet it contains complete graphs that have the largest expansion
possible (for graphs of their order). Let us thus say that a graph is $f$-non-expanding if every $t$-vertex
subgraph H satisfies $\phi_H \le f(t)$ (we assume $t \ge 2$).
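The definition of edge expansion can be made concrete with a brute-force computation. The following is only an illustrative sketch for very small graphs (it enumerates all subsets, so it takes exponential time); the function names `edge_boundary` and `edge_expansion` and the adjacency-dictionary input format are assumptions made for this example and do not appear in the paper.

```python
from itertools import combinations


def edge_boundary(adj, S):
    """Return the edges of the graph with exactly one endpoint in S."""
    S = set(S)
    return {(u, v) for u in S for v in adj[u] if v not in S}


def edge_expansion(adj):
    """Brute-force phi_G: minimize |boundary(S)| / |S| over 1 <= |S| <= n/2.

    `adj` maps each vertex to an iterable of its neighbours (undirected,
    simple graph). Exponential time; intended for tiny examples only.
    """
    vertices = list(adj)
    best = float("inf")
    for size in range(1, len(vertices) // 2 + 1):
        for S in combinations(vertices, size):
            best = min(best, len(edge_boundary(adj, S)) / size)
    return best


# Two vertex-disjoint triangles: expansion 0, as in the example in the text.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
# The complete graph on 4 vertices.
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
print(edge_expansion(two_triangles), edge_expansion(k4))
```

Checking whether a graph is $f$-non-expanding would additionally require running this over every subgraph, which is why the sketch is useful only for intuition.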

Our main result in this paper can be informally stated as follows. 

Theorem 1 (Informal Statement). We have the following dichotomy: 

• If G is $f$-non-expanding for $f(t) \ll 1/\log t$ then one can locally construct a sparse spanning
subgraph of G. The algorithm is deterministic.

• There is a family of 3-regular graphs $G_n$ that are (roughly) $1/\log t$-non-expanding so that
every (possibly randomized) local algorithm for constructing a sparse spanning subgraph of
$G_n$ must accept every edge of $G_n$.

We refer the reader to Definition 1 for the precise definition of what it means to locally construct 
a sparse spanning subgraph, and to Theorems 2 and 3 for the precise statements of the two assertions 
in Theorem 1. 




We note that there are numerous families of graphs that satisfy the condition in the first item 
of Theorem 1. For example, it follows from the planar separator theorem of Lipton and Tarjan [23] 
and its extension by Alon, Seymour and Thomas [3] that planar graphs (and more generally, $H$-minor-free
graphs) of bounded degree satisfy the condition of the first item. Also, observe that
since the graphs $G_n$ in the second item of Theorem 1 have $3n/2$ edges, there is no algorithm that
can locally construct a spanning subgraph of $G_n$ with $(1 + \epsilon)n$ edges for $\epsilon < 1/2$.

We make two comments regarding the results which appeared in the preliminary conference 
version of this paper [21]. First, it was shown in [21] that there are graphs such that any algorithm 
has to inspect $\Omega(\sqrt{n})$ edges in order to decide whether a given edge belongs to a spanning subgraph
G' containing $(1 + \epsilon)n$ edges, for a constant $\epsilon$. However, those graphs resulted from random graphs,
which have expansion $\Theta(1)$, and so could not be used in order to show that the non-expansion
requirement given in the first item of Theorem 1 cannot be relaxed. Second, it was shown in [21]
that for certain families of graphs, one can solve the sparse spanning subgraph problem in time
$O(\sqrt{n})$. It is an interesting open problem to decide whether this can be extended to hold for all
bounded-degree graphs. In fact, it would even be interesting to show that for any bounded-degree
graph G, one can find a sparse spanning subgraph using $o(n)$ queries.¹

1.1 Related work 

As is evident from the above description of the problem, the model we study here is similar to both 
classical models, such as distributed and parallel computation, and to more recent ones. In what 
follows, we describe these models and some related results, so as to provide a broad context for our 
work. 

1.1.1 Local algorithms for other graph problems 

The model of local computation algorithms as considered in this work, was defined by Rubinfeld 
et al. [36] (see also Alon et al. [2]). Such algorithms for maximal independent set, hypergraph 
coloring, $k$-CNF and maximum matching are given in [36, 2, 25, 26]. This model generalizes other
models that have been studied in various contexts, including locally decodable codes (e.g., [24]), 
local decompression [14], and local filters/reconstructors [1, 37, 9, 18, 17, 12]. Local computation 
algorithms that give approximate solutions for various optimization problems on graphs, including 
vertex cover, maximal matching, and other packing and covering problems, can also be derived 
from sublinear time algorithms for parameter estimation [33, 27, 31, 15, 40]. 

The model of local computation is related to several other models, including property testing 
and online algorithms. To give a notable example, Mansour et al. [25] proposed a general scheme 
for converting a large family of online algorithms into local computation algorithms, consequently
improving the complexity of hypergraph 2-coloring and $k$-CNF in the local computation model.

In the related field of local reconstructors, Campagna et al. [10] study the property of connectivity.
Namely, under the promise that the input graph is almost connected, their reconstructor
provides oracle access to the adjacency matrix of a connected graph which is close to the input
graph. We emphasize that our model is different from theirs, in that they allow the addition of new
edges to the graph, whereas our algorithms must provide spanning graphs whose edges are present
in the original input graph.

¹ Note that if we are allowed to make $O(n)$ queries, then we can just use the standard BFS or DFS algorithms,
which find the entire spanning tree of G.

1.1.2 Distributed and parallel algorithms 

The name local algorithms is also used in the distributed context [28, 30, 22]. As observed by Parnas
and Ron [33], local distributed algorithms can be used to obtain local computation algorithms as 
defined in this work, by simply emulating the distributed algorithm on a sufficiently large subgraph 
of the graph G. However, while the main complexity measure in the distributed setting is the 
number of rounds (where it is usually assumed that each message is of length $O(\log n)$), our main
complexity measure is the number of queries performed on the graph G. By this standard reduction, 
the bound on the number of queries (and hence running time) depends on the size of the queried 
subgraph and may grow exponentially with the number of rounds. Therefore, this reduction gives 
meaningful results only when the number of rounds is significantly smaller than the diameter of 
the graph. 

While the problem of computing a spanning graph has not been studied in the distributed 
model, the problem of computing a minimum-weight spanning tree is a central one in this model. 
Kutten and Peleg [20] provided an algorithm that works in $O(\sqrt{n}\log^* n + D)$ rounds, where D
denotes the diameter of the graph. Their result is nearly optimal in terms of the complexity in n,
as shown by Peleg and Rubinovich [34] who provided a lower bound of $\Omega(\sqrt{n}/\log n)$ rounds (when
the length of the messages must be bounded). 

Another problem studied in the distributed setting that is related to the one studied in this 
paper, is finding a sparse spanner. The requirement for spanners is much stronger since the distortion
of the distance should be as small as possible. Thus, to achieve this property, it is usually
the case that the number of edges of the spanner is super-linear in n. Pettie [35] was the first
to provide a distributed algorithm for finding a low distortion spanner with $O(n)$ edges without
requiring messages of unbounded length or $O(D)$ rounds. The number of rounds of his algorithm
is $\log^{1+o(1)} n$. Hence, the standard reduction of [33] yields a local algorithm with a trivial linear
bound on the query complexity. 

The problems of computing a spanning tree and a minimum weight spanning tree were studied 
extensively in the parallel computing model as well (see, e.g., [7], and the references therein). 
However, these parallel algorithms have time complexity which is at least logarithmic in n and 
therefore do not yield an efficient algorithm in the local computation model. See [36, 2] for further 
discussion on the relationship between the ability to construct local computation algorithms and 
the parallel complexity of a problem. 

1.1.3 Local cluster algorithms 

Local algorithms for graph theoretic problems have also been given for PageRank computations on 
the web graph [16, 8, 38, 5, 4]. Local graph partitioning algorithms have been presented in [39, 5, 6, 
41, 32], which find subsets of vertices whose internal connections are significantly richer than their 
external connections in time that depends on the size of the cluster that they output. For instance, 
Andersen and Peres [6] provide an algorithm which, given a starting vertex v, finds a cluster of v of 
small conductance, whose complexity depends on the volume of the cluster it outputs but has only 
polylogarithmic dependence in the size of the graph. However, even when the size of the cluster 
is guaranteed to be small, it is not obvious how to use these algorithms in the local computation
setting where the cluster decompositions must be consistent among queries to all vertices. 

1.1.4 Other related sublinear-time approximation algorithms for graphs 

The problem of estimating the weight of a minimum-weight spanning tree in sublinear time was 
considered by Chazelle, Rubinfeld and Trevisan [11]. They describe an algorithm whose running 
time depends on the approximation parameter, the average degree and the range of the weights, 
but does not directly depend on the number of vertices. 

1.2 Organization 

The rest of the paper is organized as follows. In Section 2 we formally define the local sparse 
spanning subgraph problem which we consider in this paper. In Section 3 we prove the first item 
of Theorem 1, which is formally stated as Theorem 2. The proof of this part has two main steps. 
In the first one, we show that if G is $f$-non-expanding with $f \ll 1/\log t$ then one can remove from
G only a relatively small number of edges and thus partition it into connected components of size
$O(1)$ each. We then show that if a graph can be so partitioned, then one can solve on it the local
spanning subgraph problem by executing a “localized” version of Kruskal’s [19] famous algorithm
for finding minimum-weight spanning trees.²

The proof of the second part of Theorem 1, which is the more challenging part of this paper, is
given in Section 4 and formally stated as Theorem 3. It establishes that the $1/\log t$-non-expansion
requirement from the first item of Theorem 1 is essentially tight. What we show is that there are
graphs which are (about) $1/\log t$-non-expanding, and have the property that any local algorithm
for constructing a spanning subgraph using a constant number of queries must accept every edge
of the graph. To prove this result we describe a construction of certain extremal graphs that might
be of independent interest. These are 3-regular graphs, that on one hand have unbounded girth,³
but on the other hand are (about) $1/\log t$-non-expanding.

We make no serious attempt to optimize the constants obtained in the various statements. In 
fact, the $f$-non-expansion requirements in our upper and lower bound results (Theorems 2 and 3),
which are about $(1/\log t)(1/\log\log t)^2$ and $(1/\log t)(\log\log t)^2$ respectively, can each be improved
by replacing the $(\log\log t)^2$ term by $(\log\log t)^{1+o(1)}$. We opted for proving our results with the
slightly weaker bounds in order to simplify the presentation. We henceforth write $\log(\cdot)$ for $\log_2(\cdot)$.

2 Preliminaries 

Let us now give the precise definition of the algorithmic problem we are addressing in this paper. 
As in most cases where one tries to design a local/distributed/sublinear algorithm, we will assume 
that the input graph is given via an oracle access to its incidence-list representation, meaning the
following: First, we assume that the input graph G = (V, E) is given via incidence-lists representation,
that is, for each vertex $v \in V(G)$, there is an ordered list of its neighbors in G. Second, the
algorithm is supplied with integers n and d, that represent the number of vertices, and an upper
bound on the degrees of vertices of G. Finally, given a pair $(v, i)$ with $1 \le v \le n$ and $1 \le i \le d$, the
oracle either returns the $i$-th neighbour of v (in the incidence list representation) or an indication
that v has less than i neighbours. We will assume that each vertex v has an id, id(v), where there
is a full order over the ids. We will think of the ids of vertices in the graphs simply as the integers
$\{1, \ldots, n\}$. We now turn to formally define the algorithmic problem we consider in this paper.

² Recall that if G is a graph with weights assigned to its edges, then Kruskal’s algorithm finds a spanning tree of
minimal total weight.

³ As usual, the girth of a graph is the minimum length of a cycle in it.

Definition 1. An algorithm A is an $(\epsilon, q)$-local sparse spanning graph algorithm if, given $n, d \ge 1$
and oracle access to the incidence-lists representation of a connected graph G = (V, E) on n vertices
and degree at most d, it provides query access to a subgraph G' = (V, E') of G such that:

i. G' is connected.

ii. $|E'| \le (1 + \epsilon) \cdot n$ with probability at least 2/3 (over the internal randomness of A).

iii. E' is determined by G and the internal randomness of A.

iv. A makes at most q queries to G.

By “providing query access to G'” we mean that on input $(u,v) \in E$, A returns whether $(u,v) \in E'$
and for any sequence of queries, A answers consistently with the same G'.

An algorithm A is an $(\epsilon, q)$-local sparse spanning graph algorithm for a family of graphs C if
the above conditions hold, provided that the input graph G belongs to C.

We note that the choice of the required success probability being 2/3 is of course arbitrary and 
can be replaced by any probability smaller than 1. Having said this, let us stress that the positive 
results we obtain here (i.e., the algorithmic results) in Theorem 2 are deterministic (i.e., hold with 
probability 1), whereas our lower bound in Theorem 3 holds for any positive success probability. 
We also note that even though Definition 1 considers only the number of queries performed by the 
algorithm, our algorithm in Theorem 2 runs in time polynomial in the number of queries, and in 
particular, independent of n. 

We are interested in local algorithms that have query complexity which is independent of n, 
namely, that perform a constant number of queries to the graph (for each edge they are queried on) 
and whose running time (per queried edge) is small as well. In the next section, we show that the 
family of graphs that are $f$-non-expanding with $f \ll 1/\log t$ have a local sparse spanning graph
algorithm. In the following section, we will show that one cannot prove such a result when $f$ is
only slightly larger. 

3 Upper bound

In this section we prove the following theorem, which formalizes the first assertion of Theorem 1. 

Theorem 2. For every C there is a function $q : \mathbb{R}^+ \times \mathbb{N} \to \mathbb{N}$ so that for every $\epsilon > 0$ there is an
$(\epsilon, q(\epsilon, d))$-local sparse spanning graph algorithm for the family of $f$-non-expanding graphs with

$$f(x) = \frac{C}{\log x \cdot (\log\log x)^2}, \tag{1}$$

where d is the input degree-bound. Furthermore, the algorithm is deterministic.
3.1 Decomposition of non-expanding graphs

The first step in the proof of Theorem 2 is a decomposition result stated in Lemma 1 below. It
shows that if G is $f$-non-expanding, with $f$ as in Equation (1), then G can be decomposed into
connected components of bounded size by removing only $\epsilon n$ edges. This extends a result of [13]
that applies for somewhat larger $f$. As mentioned earlier, there are many families of graphs which
are $f$-non-expanding with $f$ as in Equation (1). For example, planar graphs of bounded degree
are $f$-non-expanding with $f = O(1/\sqrt{x})$ by the famous planar separator theorem of Lipton and
Tarjan [23]. More generally, a result of Alon, Seymour and Thomas [3] implies that for any fixed
H, the family of $H$-minor-free graphs of bounded degree is $f$-non-expanding with $f = O(1/\sqrt{x})$.
Hence, Lemma 1 applies to these families of graphs in particular. We note that the reason why the
bound in Lemma 1 is doubly exponential in $1/\epsilon$ is that we insist on assuming that $f$ is very close to
the threshold of $1/\log x$ (which by Theorem 3 is essentially tight). For example, the details of the
proof of Lemma 1 show that if $f = x^{-c}$ for some constant $c > 0$ (as is the case with planar graphs, say),
then the bound can be improved to polynomial in $1/\epsilon$. We note that in such cases we can also set
$k = \mathrm{poly}(1/\epsilon)$ in step 1 of our algorithm (Algorithm 1 below), thus obtaining a much more efficient
algorithm.

Lemma 1. If G is an $n$-vertex $f$-non-expanding graph with $f(x) = C/(\log x (\log\log x)^2)$, then one
can remove $\epsilon n$ edges from G so that each connected component of the remaining graph is of size at
most $2^{2^{2(C/\epsilon)+3}}$.

Proof: First, we claim that any $f$-non-expanding $n$-vertex graph G = (V, E) has a subset $S \subseteq V(G)$
of size $n/3 \le |S| \le (2/3)n$ and expansion $\phi_G(S) := |\partial_G(S)| / |S| \le f(n/3)$. For the proof we
iteratively construct subsets $S_1, \ldots, S_k \subseteq V(G)$ as follows. To obtain $S_i$, we consider the induced
subgraph $G_i = G[V \setminus \bigcup_{j=1}^{i-1} S_j]$ and let $S_i \subseteq V(G_i)$ satisfy $|S_i| \le n_i/2$ and $\phi_{G_i}(S_i) \le f(n_i)$, where
$n_i = |V(G_i)|$. We stop once $S := \bigcup_{i=1}^{k} S_i$ is of size $|S| \ge n/3$. Note that

$$|S| \le \sum_{i=1}^{k-1} |S_i| + n_k/2 = \Big(n + \sum_{i=1}^{k-1} |S_i|\Big)/2 \le 2n/3.$$

It remains to bound $\phi_G(S)$. Observe that every edge in the edge boundary $\partial_G(S)$ is a member of
some edge boundary $\partial_{G_i}(S_i)$. Hence,

$$\frac{|\partial_G(S)|}{|S|} \le \sum_{i=1}^{k} \frac{|\partial_{G_i}(S_i)|}{|S|} = \sum_{i=1}^{k} \frac{|S_i|}{|S|}\,\phi_{G_i}(S_i) \le \max_{1 \le i \le k} \phi_{G_i}(S_i) \le \max_{1 \le i \le k} f(n_i) = f(n_k) \le f(n/3),$$

where in the last inequality we used the fact that $n_k \ge n - |S| \ge n/3$. This proves our claim.

Fix an integer $k \ge 50$ and let G = (V, E) be any $f$-non-expanding graph on $n \ge k/3$ vertices.
Consider the following process: take any subset $S \subseteq V$ of size $n/3 \le |S| \le n/2$ and expansion
$\phi_G(S) \le 2f(n/3)$ (whose existence follows from the claim above), remove all its outgoing edges and
proceed recursively on the two induced subgraphs $G[S]$ and $G[V \setminus S]$, which are $f$-non-expanding
as well. The recursion stops whenever we reach a graph on at most k vertices. It is clear that
at the end of this process, the edges removed from G leave a graph whose connected components
have at most k vertices each. Let $r_k(G)$ be the number of edges removed by the above process.
We will shortly prove that if G has n vertices, then $r_k(G) \le Cn/\ln\ln(k/3)$. Hence, setting
$k = 2^{2^{2(C/\epsilon)+3}} \ge \max\{50,\ 3 \cdot e^{e^{C/\epsilon}}\}$ enables us to remove no more than $\epsilon n$ edges and break G into
connected components of size at most k, thus proving the lemma.
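The choice of k can be verified by a short calculation. The following worked derivation is a sketch added for the reader's convenience (it is not taken verbatim from the proof); it checks that the stated k makes the bound $Cn/\ln\ln(k/3)$ at most $\epsilon n$ and that the stated lower bound on k holds.

```latex
% Requirement: the number of removed edges is at most \epsilon n.
\frac{Cn}{\ln\ln(k/3)} \le \epsilon n
\iff \ln\ln(k/3) \ge \frac{C}{\epsilon}
\iff k \ge 3\, e^{e^{C/\epsilon}} .

% The chosen k satisfies this: with x = C/\epsilon \ge 0,
\ln k = 2^{2x+3}\ln 2 = (8\ln 2)\cdot 4^{x}
      \ge 5\, e^{x}
      \ge \ln 3 + e^{x}
      = \ln\!\big(3\, e^{e^{x}}\big),

% using 8\ln 2 > 5, \; 4^{x} \ge e^{x}, \; and \; 4e^{x} \ge 4 > \ln 3 .
% Moreover k \ge 2^{2^{3}} = 256 \ge 50 .
```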

In order to facilitate an inductive proof, it will be more convenient to prove the following slightly 
stronger claim: 


$$r_k(G) \le \beta(n) := \frac{Cn}{\ln\ln(k/3)} - \frac{Cn}{\ln\ln n}. \tag{2}$$


Set $h(x) = x/\ln\ln x$ and $f^*(x) = f(x)/C = (\log x)^{-1}(\log\log x)^{-2}$. First, we establish some
properties of h. It is easy to verify that the derivative of h is $h'(x) = (\ln\ln x)^{-1} - (\ln x)^{-1}(\ln\ln x)^{-2}$,
and moreover, $h''(x) \le 0$ for $x \ge 20$. It follows that for every $n \ge 50$ and $n/3 \le s \le n/2$ we have

$$h(n) - h(n-s) \le s \cdot h'(n-s) \le h(s) - s \cdot 2f^*(n/3), \tag{3}$$

where in the first inequality we used the concavity of h on the interval $(20, \infty)$, and in the second
inequality we used the fact that $n \ge 50$ and $n/3 \le s \le n/2$ and that in this range

$$\log(n/3)(\log\log(n/3))^2/2 \ge \ln(2n/3)(\ln\ln(2n/3))^2 \ge \ln(n-s)(\ln\ln(n-s))^2.$$


We prove Equation (2) by induction on n. For the base case(s) where $(k/3 \le)\ n \le k$ we have
that $\beta(n) \ge \beta(k/3) = 0 = r_k(G)$, as needed. For the induction step we have

$$\begin{aligned}
r_k(G) &\le \max_{n/3 \le |S| \le n/2} |S| \cdot 2f(n/3) + r_k(G[S]) + r_k(G[V \setminus S]) \\
&\le \max_{n/3 \le s \le n/2} s \cdot 2f(n/3) + \beta(s) + \beta(n-s) \\
&= C\Big(n/\ln\ln(k/3) + \max_{n/3 \le s \le n/2} \big(s \cdot 2f^*(n/3) - h(s) - h(n-s)\big)\Big) \\
&\le C\big(n/\ln\ln(k/3) - h(n)\big) = \beta(n),
\end{aligned}$$

where the first inequality follows from the definition of the process described in the second paragraph
of the proof, the second inequality follows from the induction hypothesis since $k/3 \le s, n-s \le n-1$,
and in the third inequality we used (3) since $n \ge k \ge 50$. This completes the proof of Equation (2).


3.2 The algorithm 

The algorithm we design in order to prove Theorem 2 is based on Kruskal’s minimum-weight
spanning tree algorithm [19]. The idea is to assign weights to the edges of the graph in a way that 
will determine some fixed spanning tree T. The algorithm will always accept the edges of T but 
will also accept a few other edges. We will pick the weights of the edges in a way that will make it 
possible to determine the edges of a sparse spanning subgraph in a “local” fashion. 

Recall that Kruskal’s algorithm for finding a minimum-weight spanning tree in a weighted
connected graph works as follows. First it sorts the edges of the graph $e_1, \ldots, e_m$ from minimum to
maximum weight (breaking ties arbitrarily). It then goes over the edges in this order, and adds
each edge to the spanning tree if and only if it does not close a cycle with the previously selected edges. Put
differently: 




Fact 1. Edge e is picked by Kruskal’s algorithm if and only if for any cycle C of G containing e, 
the edge e does not have the largest weight among the edges of C. 

It is well known (and easy to verify) that if the weights of the edges are distinct, then there 
is a single minimum-weight spanning tree in the graph. For an unweighted graph G , consider the 
order defined over its edges by the order of the ids of the vertices. Namely, we define a ranking 
r of the edges as follows: $r(u,v) < r(u',v')$ if and only if $\min\{id(u), id(v)\} < \min\{id(u'), id(v')\}$
or $\min\{id(u), id(v)\} = \min\{id(u'), id(v')\}$ and $\max\{id(u), id(v)\} < \max\{id(u'), id(v')\}$. If we run
Kruskal’s algorithm using the rank r as the weight function (where there is a single ordering of the 
edges), then we obtain a (well-defined) spanning tree of G. 
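The global procedure just described can be sketched as follows. This is an illustrative implementation only: the function names `rank` and `kruskal_by_rank`, the edge-list input format, and the use of a union-find structure are choices made for this example, with vertex ids assumed to be the integers 1, ..., n.

```python
def rank(u, v):
    """Rank of edge (u, v): compare the smaller id first, then the larger id."""
    return (min(u, v), max(u, v))


def kruskal_by_rank(n, edges):
    """Run Kruskal's algorithm with the rank above used as the edge weight.

    `n` is the number of vertices (ids 1..n) and `edges` is a list of pairs.
    Returns the list of selected tree edges in the order they were added.
    """
    parent = list(range(n + 1))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for u, v in sorted(edges, key=lambda e: rank(*e)):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge does not close a cycle with earlier edges
            parent[ru] = rv
            tree.append((u, v))
    return tree


# A 4-cycle 1-2-3-4-1: the edge of largest rank, (3, 4), is the one left out.
cycle_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(kruskal_by_rank(4, cycle_edges))
```

Because the ranks are distinct for distinct edges of a simple graph, the output does not depend on how ties would be broken.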

While the local algorithm described next (Algorithm 1) is based on the aforementioned global 
algorithm, it does not exactly emulate it, but rather emulates a certain relaxed version of it which 
can be executed locally. In particular, it will answer YES for every edge selected by the global 
algorithm (ensuring connectivity), but may answer YES also on edges not selected by the global 
algorithm. We will thus need to show that it does not answer YES on too many edges that are not 
selected by the global algorithm. 

In the description and analysis of the algorithm we will use the following standard notation: for
a vertex $v \in V$ and an integer k, we denote by $C_k(v, G)$ the subgraph of G induced by the set of
vertices at distance at most k from v.


Algorithm 1 (Kruskal-based Algorithm) 

(The algorithm works for some fixed $\epsilon > 0$.)

Input: $n, d \ge 1$, query access to a graph G on n vertices and degree at most d.
Queried edge: $(x,y) \in E(G)$.

1. Set $k = 2^{2^{2(C/\epsilon)+3}}$.

2. Perform a BFS to depth k from x, thus obtaining the subgraph $C_k(x, G)$.

3. If $(x,y)$ is the edge with largest rank on some cycle in $C_k(x,G)$ then answer NO;
Otherwise, answer YES.
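The three steps of Algorithm 1 can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the depth k is passed as a parameter (the value from step 1 is far too large to use in practice), the oracle is abstracted as a hypothetical callable `neighbors(v)` returning the neighbour list of v, and step 3 is tested via the equivalent condition that x and y are joined inside $C_k(x, G)$ by a path using only edges of smaller rank.

```python
from collections import deque


def rank(u, v):
    """Rank of edge (u, v): smaller id first, then larger id."""
    return (min(u, v), max(u, v))


def ball(neighbors, x, k):
    """Step 2: BFS to depth k from x; returns the vertex set of C_k(x, G)."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not explore beyond depth k
        for w in neighbors(u):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


def local_kruskal_query(neighbors, x, y, k):
    """Step 3: return True (YES) unless (x, y) has the largest rank on some
    cycle of C_k(x, G), i.e. unless y is reachable from x inside the ball
    using only edges whose rank is smaller than rank(x, y)."""
    inside = ball(neighbors, x, k)
    r = rank(x, y)
    seen, stack = {x}, [x]
    while stack:
        u = stack.pop()
        for w in neighbors(u):
            if w in inside and w not in seen and rank(u, w) < r:
                if w == y:
                    return False  # answer NO
                seen.add(w)
                stack.append(w)
    return True  # answer YES


# Example: a 6-cycle with ids 1..6; the edge of largest rank is (5, 6).
adj = {i: [(i - 2) % 6 + 1, i % 6 + 1] for i in range(1, 7)}
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (1, 6)]
accepted = [e for e in edges if local_kruskal_query(adj.__getitem__, *e, k=3)]
print(accepted)
```

In the example, depth 3 already reveals the whole cycle, so exactly one edge is rejected; with depth 1 no cycle is visible and every edge is accepted, which illustrates why the depth must be at least the component size guaranteed by Lemma 1.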


Proof of Theorem 2: We will show that if G = (V, E) is $C/(\log x (\log\log x)^2)$-non-expanding then
Algorithm 1 is an $(\epsilon, q(\epsilon, d))$-local sparse spanning subgraph algorithm, where $q(\epsilon, d) = d^{k+1}$ with
k being the constant from step 1 of the algorithm. By the description of Algorithm 1 it directly
follows that the algorithm is deterministic and that its answers are consistent with a connected
subgraph G'. Indeed, if T is the tree returned by Kruskal’s algorithm, then Fact 1 and step 3 of
Algorithm 1 guarantee that each edge of T will be accepted by Algorithm 1. Observe that the
number of queries to G performed by Algorithm 1 is at most $d^{k+1}$. We now complete the proof by
showing that the algorithm returns YES on fewer than $(1 + \epsilon)n$ edges.

Let R be a set of at most $\epsilon n$ edges whose removal disconnects G into connected components
of size at most k. The existence of such a set is guaranteed by Lemma 1. Let $G_R$ be the graph
obtained by removing R from G; that is, $G_R = (V, E \setminus R)$. We note (crucially) that while the
analysis of the algorithm uses properties of $G_R$, the algorithm does not actually compute R. We
will now show that G' does not contain a cycle of $G_R$. Since $|R| \le \epsilon n$, this means that G' has fewer
than $(1 + \epsilon)n$ edges. Indeed, the edges of G' outside R then form a forest, which has at most $n - 1$ edges.

Let $\sigma$ be a cycle in $G_R$. Suppose $(w, v)$ is the edge of $\sigma$ with the largest rank. Since the
connected components of $G_R$ are of size at most k, we infer that $\sigma$ has at most k vertices, implying
that $C_k(w, G)$ contains $\sigma$. It follows that on query $(w, v)$ the algorithm will return NO. Thus, G'
does not contain $\sigma$. ■

4 Lower bound 

The next theorem shows that there are graphs for which any local sparse spanning graph algorithm 
must perform a number of queries that grows with n, yet these graphs are $f$-non-expanding with
$f(x)$ only slightly larger than $1/\log x$. This is essentially the best one can hope for in light of
Theorem 2. 

Theorem 3. For infinitely many n, there is an f -non-expanding n-vertex graph G with 

$$f(x) = \frac{1}{\log x} \cdot (70 \log\log x)^2$$

such that every $(\frac{1}{3}, q)$-local sparse spanning graph algorithm for the graphs isomorphic to G satisfies
$q \ge \log\log(n)/8000$.

4.1 A regular non-expanding graph 

The main result in this subsection (stated in Lemma 2) is a construction of regular non-expanding 
graphs that we will use in Subsection 4.2 to prove Theorem 3. A main ingredient is a result from [29] 
showing that, roughly speaking, there are graphs that simultaneously have large girth and small 
hereditary expansion (in fact, small edge separators). While the degree of these graphs may grow 
with n, their maximum degree is at most a constant times their average degree. We will use this in 
order to construct a regular graph with similar properties. We note that the regularity condition 
is crucial for proving Theorem 3. The following theorem was proved in [29]. 

Theorem 4 ([29]). For any $n, k$ with $2 \le k \le \log\log n$ there is an $n$-vertex graph $G = G_{n,k}$
satisfying:

i. G has average degree at least k and maximum degree at most 6k.

ii. G has girth at least $\log n/(6k)^2$.

iii. For every $t$-vertex subgraph H of G that is not a forest, there exists a subset $S \subseteq V(H)$ of
size $(1/3)t \le |S| \le (2/3)t$ such that

$$|\partial_H(S)| \le \frac{t}{\log t} \cdot (\log\log t)^2.$$

We note that each of the parameters in Theorem 4 is quantitatively essentially optimal (see [29]
for further discussion). 

The main result in this section is the following. 
Lemma 2. For any $n_0$ there is a connected graph $G_0$ on $n \ge n_0$ vertices satisfying:

i. $G_0$ is 3-regular.

ii. $G_0$ has girth at least $\log\log(n)/2000$.

iii. $G_0$ is $f$-non-expanding with $f(x) = (1/\log x) \cdot (4 \log\log x)^2$.

For the proof we will need the weighted version of the well-known vertex separator theorem for 
trees. For completeness, we give a short proof below. 

Claim 3. Let T = (V, E) be a tree, and let $w : V \to \mathbb{R}^+$ be a nonnegative weight function over the
vertices of T. There is a vertex $v \in V$ whose removal disconnects T into connected components of
weight at most $w(V)/2$ each.⁴

Proof: Start a walk in T from an arbitrary vertex, in each step moving from a vertex u to
a neighbor $u'$ if the weight of the tree rooted at $u'$, when the edge $(u, u')$ is removed, is strictly
greater than $w(V)/2$. Since T has no cycles and since the walk never reverts the last step taken, the
walk eventually stops at some vertex v. This means that when v is removed from T, the weight of
the tree rooted at each of the neighbors of v is at most $w(V)/2$. Since these trees are the connected
components resulting from the removal of v, we are done. ■
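The walk in the proof of Claim 3 translates directly into a procedure for finding such a separator vertex. The sketch below is illustrative only; the function name `weighted_separator`, the adjacency-dictionary and weight-dictionary inputs, and the precomputation of subtree weights from an arbitrary root are choices made for this example.

```python
def weighted_separator(adj, w):
    """Return a vertex of the tree `adj` whose removal leaves components of
    weight at most w(V)/2 each, following the walk from the proof of Claim 3.

    `adj` maps each vertex to a list of neighbours (the graph must be a tree);
    `w` maps each vertex to a nonnegative weight.
    """
    total = sum(w.values())

    # Root the tree at an arbitrary vertex and record a BFS order.
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    for u in order:  # the list grows while we iterate over it
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)

    # sub[u] = total weight of the subtree hanging below u (including u).
    sub = dict(w)
    for u in reversed(order):
        if parent[u] is not None:
            sub[parent[u]] += sub[u]

    v = root
    while True:
        heavy = None
        for u in adj[v]:
            # Weight of the component containing u once v is removed.
            comp = sub[u] if parent[u] == v else total - sub[v]
            if comp > total / 2:
                heavy = u
                break
        if heavy is None:
            return v  # every component has weight at most total / 2
        v = heavy     # move towards the unique heavy component


# A path 1-2-3-4-5 with unit weights: the middle vertex is returned.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(weighted_separator(path, {v: 1 for v in path}))
```

Since the walk started at the root only ever moves to a child whose subtree is heavy, it never steps back, mirroring the termination argument in the proof.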

Proof of Lemma 2: Set $k = \log\log m/648$ and let $G_{m,k}$ be the graph from Theorem 4, where
we take m to be large enough such that $k \ge \max\{n_0, 2\}$. We note that in the rest of the proof we
will use the inequality

$$(6k)^4 \le \log m \tag{4}$$

which holds since m is sufficiently large. As is well known, by iteratively removing vertices of $G_{m,k}$
of degree at most $k/2$, one obtains a (non-empty) graph of minimum degree at least $k/2$. Let G be
a connected component of the largest average degree in the obtained graph. Note that the average
degree of G is at least k, the maximum degree is still at most 6k, and the girth is still at least
$\log m/(6k)^2$. Finally, G still satisfies item (iii) of Theorem 4, being a subgraph of $G_{m,k}$.

Let $G_0$ be obtained by taking the replacement product of $G$ with a cycle. That is, $G_0$ is obtained
from $G$ by replacing each vertex of degree $x$ by a cycle on $x$ new vertices, which we henceforth refer
to as a “cloud”, and further adding edges as follows: if $u, v$ are adjacent in $G$, with $u$ being the
$i$-th neighbor of $v$ and $v$ being the $j$-th neighbor of $u$ (under a fixed arbitrary enumeration of the
neighbors of each vertex), then the $i$-th vertex in the cloud corresponding to $v$ is connected by an
edge to the $j$-th vertex in the cloud corresponding to $u$. So for example, it is easy to see that there
is a one-to-one correspondence between the edges of $G$ and those edges of $G_0$ that connect vertices
from different clouds. Note that our graph $G_0$ is connected, as needed. Letting $n$ denote the number
of its vertices, note that $n$ equals the sum of the degrees of all vertices of $G$, so $n \ge k|V(G)| > n_0$,
as needed. Furthermore, $G_0$ is 3-regular, since each vertex has two neighbors in its cloud and one
neighbor in precisely one other cloud, as required by item (i) of the statement.
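
To make the construction concrete, here is a small sketch of the replacement product with cycles, together with a BFS-based girth computation for checking small examples. The function names and the representation (a dictionary mapping each vertex to its ordered neighbor list) are assumptions of this illustration, and the input is assumed to be a simple graph with all degrees at least 3, so that every cloud is a genuine cycle.

```python
from collections import deque


def replacement_product_with_cycles(adj):
    """adj: dict v -> ordered list of neighbours (simple graph, degrees >= 3).

    Returns the adjacency dict of the product graph. Its vertices are pairs
    (v, i), meaning "the i-th vertex in the cloud of v".
    """
    g0 = {}
    for v, nbrs in adj.items():
        d = len(nbrs)
        for i, u in enumerate(nbrs):
            j = adj[u].index(v)        # v is the j-th neighbour of u
            g0[(v, i)] = [
                (v, (i - 1) % d),      # previous vertex on the cloud cycle
                (v, (i + 1) % d),      # next vertex on the cloud cycle
                (u, j),                # the single edge leaving the cloud
            ]
    return g0


def girth(adj):
    """Length of a shortest cycle of a simple graph (inf if it is a forest)."""
    best = float("inf")
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    parent[y] = x
                    queue.append(y)
                elif parent[x] != y:
                    # A non-tree edge closes a cycle of at most this length.
                    best = min(best, dist[x] + dist[y] + 1)
    return best


# Usage: the complete graph K_5 (4-regular, girth 3).
k5 = {v: [u for u in range(5) if u != v] for v in range(5)}
g0 = replacement_product_with_cycles(k5)
print(len(g0), {len(nbrs) for nbrs in g0.values()}, girth(g0))
```

For $K_5$ the product has 20 vertices (the sum of the degrees) and is 3-regular. Its shortest cycles are the clouds themselves, of length 4.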

Let us now prove that the girth of $G_0$ is at least the minimum between the girth of $G$ and the
minimum degree of $G$. First, note that any cycle $C$ in $G_0$, other than a cloud, naturally determines
a closed trail in $G$ (i.e., where vertices may be visited more than once, but not edges). Indeed, for
each edge of $C$ that connects two different clouds, the trail simply moves along the corresponding
edge in $G$. Note that the length of $C$ is at least the length (i.e., number of edges) of the trail.
Since the length of the shortest closed trail in $G$ is its girth, we conclude that the length of any
cycle in $G_0$ is at least the girth of $G$, unless that cycle is a cloud. Furthermore, since the smallest
number of vertices in a cloud of $G_0$ equals the minimum degree of $G$, our claim follows. That is,
the girth of $G_0$ is at least

\[ \min\{\log m/(6k)^2,\ k/2\} = \log\log(m)/1296 \ge \log\log(n)/2000 , \]

where we used the setting of $k$, Equation (4) and the fact that $n \le 6k|V(G)| \le 6km \le m^2$. This
proves item (ii) of the statement.

4 For a subset $X \subseteq V$ we write $w(X) = \sum_{v \in X} w(v)$.

It remains to show that $G_0$ satisfies item (iii) of the statement as well. Let $H_0$ be a $t$-vertex
subgraph of $G_0$. Our goal is to bound $\phi_{H_0}$ from above. Let $H$ be the induced subgraph of $G$
obtained by retaining only those vertices whose corresponding cloud has at least one vertex in $H_0$.
Put $h = |V(H)|$, and notice $t \ge h$. We next consider two cases, depending on whether $H$ is a forest
or not.

First, suppose that $H$ is not a forest. Hence, by item (iii) of Theorem 4 (a property which is also
satisfied by $G$, as mentioned above) there is a partition $V(H) = S \cup S'$ with $|S|, |S'| \ge h/3$ satisfying
$|\partial_H(S)|, |\partial_H(S')| \le (h/\log h) \cdot (\log\log h)^2$. Let $S_0$ be the subset of $V(H_0)$ corresponding to $S$ (i.e.,
obtained by replacing each vertex in $S$ with the vertices of its cloud in $H_0$). Assume without loss
of generality that $|S_0| \le t/2$ (otherwise take $S'_0$, which is defined from $S'$ in a similar fashion).
Observe that $|\partial_{H_0}(S_0)| \le |\partial_H(S)|$, since any edge in $\partial_{H_0}(S_0)$ must go between two different clouds,
and there is a unique edge in $\partial_H(S)$ connecting the two vertices corresponding to these clouds.
Therefore,

\[ \phi_{H_0} \le \frac{|\partial_{H_0}(S_0)|}{|S_0|} \le \frac{(h/\log h) \cdot (\log\log h)^2}{h/3} = \frac{3(\log\log h)^2}{\log h} \le \frac{3(\log\log t)^2}{\log(t/6k)} \le \frac{6(\log\log t)^2}{\log t} , \]

where in the second inequality we used the fact that $|S_0| \ge |S| \ge h/3$, in the third inequality we
used the fact that $h \le t \le 6k \cdot h$, and in the last inequality we used the fact that $t/6k \ge \sqrt{t}$ (i.e.,
$\sqrt{t} \ge 6k$); the latter follows from the fact that since $H$ is not a forest, $h$ is at least the girth of $G$,
so $t \ge h \ge \log m/(6k)^2 \ge (6k)^2$ by Equation (4). This proves item (iii) of the statement under the
assumption that $H$ is not a forest.

Suppose next that $H$ is a forest. Notice we may assume that $H$ is a connected graph since
otherwise $H_0$ is also not connected, meaning that $\phi_{H_0} = 0$ so we are done. We apply Claim 3
on the tree $H$, where we set the weight of each vertex in $H$ to be the number of vertices in the
corresponding cloud in $H_0$. Let $v$ be the vertex guaranteed by Claim 3, and let $v_1, \ldots, v_d$ be the
vertices of the cloud/cycle corresponding to $v$, in their order on the cycle. For each $1 \le i \le d$, let
$S_i \subseteq V(H_0)$ be the set of vertices in $H_0$ corresponding to the $i$-th connected component of $H - v$
(i.e., so that $v_i$ is the unique vertex in the cloud of $v$ that is connected to $S_i$). Put $S'_i = S_i \cup \{v_i\}$.
Then $\sum_{i=1}^{d} |S'_i| = t$, and our choice of $v$ guarantees that $|S'_i| \le t/2 + 1$. We claim that there is an
index $1 \le j \le d$ such that $(1/4)t \le \sum_{i=1}^{j} |S'_i| \le (3/4)t$. Indeed, if $1 \le j \le d$ is the smallest index
such that $\sum_{i=1}^{j} |S'_i| \ge (1/4)t$ then

\[ \sum_{i=1}^{j} |S'_i| = \sum_{i=1}^{j-1} |S'_i| + |S'_j| \le (t/4 - 1) + (t/2 + 1) = (3/4)t . \]

Now, let $S_0 \subseteq V(H_0)$ be the smaller of $\bigcup_{i=1}^{j} S'_i$ and its complement, so that $t/4 \le |S_0| \le t/2$.
Observe that since $\{1, 2, \ldots, j\}$ is an interval, $|\partial_{H_0}(S_0)| \le 2$. We conclude that

\[ \phi_{H_0} \le 2/(t/4) = 8/t \le (1/\log t) \cdot (4\log\log t)^2 , \]

where in the last inequality we used the fact that $(x/\log x) \cdot (\log\log x)^2 \ge 1/2$, which can be verified
to hold for any real $x \ge 3$ (and thus for any integer $t > 2$). This completes the proof. ■
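
The last inequality can be checked numerically. The sketch below is only a sanity check on a finite grid, not a proof, and it assumes that logarithms are taken base 2 (the inequality fails at $x = 3$ for natural logarithms). The helper name `g` is introduced here for illustration.

```python
import math


def g(x):
    """(x / log x) * (log log x)^2 with base-2 logarithms (an assumption)."""
    return (x / math.log2(x)) * math.log2(math.log2(x)) ** 2


# Sample x = 3.00, 3.01, ..., 1000.00 and record the smallest value seen.
samples = [3 + i / 100 for i in range(99701)]
smallest = min(g(x) for x in samples)
print(smallest >= 0.5)  # the bound used in the proof holds on this grid
```

On this grid the minimum is attained at the left endpoint $x = 3$, which is consistent with the function being increasing for $x \ge 3$.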

4.2 Lower bound proof 

For our proof of Theorem 3 we will need the graph witnessing the lower bound to contain a bridge. 
The following lemma shows that one can modify a given graph so as to contain a bridge while 
preserving high girth and small hereditary expansion. 

Lemma 4. Suppose there is a 3-regular connected $n$-vertex graph $G$ with girth $g$ that is $f$-non-expanding,
where $f : [1/2, \infty) \to \mathbb{R}$ is monotone decreasing. Then there is a 3-regular connected
$(2n + 2)$-vertex graph that contains a bridge, and moreover, has girth at least $g$ and is $h$-non-expanding
with $h(x) = 3f(x/2 - 1)$.

Proof: Let $G_1, G_2$ be two vertex-disjoint copies of $G$. Let $e_i$ be an arbitrary edge of $G_i$, $i \in \{1, 2\}$,
and let $G'_i$ be obtained by subdividing $e_i$. That is, $G'_i$ is obtained from $G_i$ by adding a new vertex
$w_i$, removing the edge $e_i = (u_i, v_i)$ and adding the edges $(u_i, w_i)$, $(w_i, v_i)$. It is clear that subdividing
an edge does not decrease the girth. Now, construct the graph $F$ from the union of $G'_1$ and $G'_2$
by adding the bridge $(w_1, w_2)$. It is clear that $F$ is 3-regular, connected and has girth at least $g$.
It therefore remains to show that $F$ is $h$-non-expanding. Let $H$ be a $t$-vertex subgraph of $F$ with
$t \ge 2$. We need to show that $\phi_H \le h(t)$. Without loss of generality, $H$ has at least $t/2$ vertices
in $G'_1$. Let $H'$ be the subgraph of $H$ induced by those vertices, where we remove the subdividing
vertex $w_1$ if $w_1 \in V(H)$. Note that $H'$ is a subgraph of $G_1$. Let $t' \ge t/2 - 1$ denote the number
of vertices of $H'$. Since $H'$ is $f$-non-expanding, there is a subset $S \subseteq V(H')$ with $|S| \le t'/2$ and
$|\partial_{H'}(S)|/|S| \le f(t')$. Note that $|\partial_H(S)| \le |\partial_{H'}(S)| + 2$, since the only edges in $H$ connecting a
vertex in $H'$ and a vertex not in $H'$ are $(u_1, w_1)$ and $(w_1, v_1)$. We conclude that

\[ \phi_H \le 3f(t') \le 3f(t/2 - 1) = h(t) , \]

where in the second inequality we used the monotonicity of $f$ for $t \ge 1/2$. ■
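
The construction of $F$ in the proof of Lemma 4 is easy to carry out explicitly. The following sketch is an illustration under assumed names and an assumed adjacency-dictionary representation. It builds two copies of a graph, subdivides one edge in each, and joins the two subdividing vertices. A small reachability helper is included to confirm that the added edge is indeed a bridge.

```python
def join_by_bridge(adj, e1, e2):
    """Two disjoint copies of `adj`, each with one edge subdivided, joined by
    a bridge between the two subdividing vertices.

    adj: dict v -> list of neighbours; e1, e2: edges (a, b) of adj.
    Vertices of the result are pairs (copy, v), plus ("w", copy).
    """
    f = {}
    for c, (a, b) in ((1, e1), (2, e2)):
        for v, nbrs in adj.items():
            f[(c, v)] = [(c, x) for x in nbrs]
        w = ("w", c)
        # Subdivide (a, b): replace it by the path a - w - b.
        f[(c, a)] = [w if x == (c, b) else x for x in f[(c, a)]]
        f[(c, b)] = [w if x == (c, a) else x for x in f[(c, b)]]
        f[w] = [(c, a), (c, b)]
    # The bridge between the two subdividing vertices.
    f[("w", 1)].append(("w", 2))
    f[("w", 2)].append(("w", 1))
    return f


def reachable(f, start, removed_edge):
    """Vertices reachable from `start` once `removed_edge` is deleted."""
    banned = {removed_edge, removed_edge[::-1]}
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in f[x]:
            if (x, y) not in banned and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


# Usage: start from K_4, which is 3-regular on n = 4 vertices.
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
F = join_by_bridge(k4, (0, 1), (0, 1))
bridge = (("w", 1), ("w", 2))
print(len(F), {len(nbrs) for nbrs in F.values()}, len(reachable(F, ("w", 1), bridge)))
```

For $K_4$ the result has $2 \cdot 4 + 2 = 10$ vertices, all of degree 3, and deleting the added edge leaves exactly one copy (5 vertices) reachable from $w_1$.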

For a local sparse spanning graph algorithm $A$, we denote by $A(G, u, v) \in \{0, 1\}$ the output of $A$
when the input graph is $G = (V, E)$ and the input edge is $(u, v) \in E$. The query-answer transcript
of $A$ on $G$, where $A$ makes $q$ queries and $G$ is $d$-regular, is the sequence of triples $((x_j, i_j, y_j))_{j=1}^{q}$
where $(x_j, i_j) \in V \times [d]$ is the $j$-th query and $y_j \in V$ is the corresponding answer.

Finally, for a permutation $\sigma$ on $V$, we denote by $\sigma(G)$ the graph isomorphic to $G$ on the same
vertex set, for which $(u, v) \in E(\sigma(G))$ if and only if $(\sigma(u), \sigma(v)) \in E(G)$. We stress that in what
follows, the graph $\sigma(G)$ will not necessarily have the same neighborhood ordering as that of $G$.
That is, if $y$ is the $i$-th neighbor of $x$ in $G$ and $\sigma(v) = x$, $\sigma(u) = y$ then $u$ is not necessarily the $i$-th
neighbor of $v$ in $\sigma(G)$.
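
As a small illustration of this definition, the edge set of the permuted graph can be computed by pulling every edge of $G$ back through the inverse permutation. The function name and the representation of the permutation as a dictionary are assumptions of this sketch. Neighbor orderings are deliberately ignored, in line with the remark above.

```python
def permuted_graph(edges, sigma):
    """Edge set of sigma(G): (u, v) is an edge iff (sigma(u), sigma(v)) is an edge of G.

    edges: iterable of pairs (a, b); sigma: dict describing a permutation of V.
    """
    inverse = {image: x for x, image in sigma.items()}
    return {frozenset((inverse[a], inverse[b])) for a, b in edges}


# Usage: G is the path 1 - 2 - 3 and sigma is the cyclic shift 1 -> 2 -> 3 -> 1.
G = [(1, 2), (2, 3)]
sigma = {1: 2, 2: 3, 3: 1}
sigma_G = permuted_graph(G, sigma)
print(frozenset((1, 2)) in sigma_G)  # (sigma(1), sigma(2)) = (2, 3) is an edge of G
```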

Lemma 5. Let G be a 3-regular connected graph of girth g that contains a bridge. Any ,q)-local 
sparse spanning graph algorithm for the graphs isomorphic to G satisfies q > g/2. 
Proof: Let $A$ be a local sparse spanning graph algorithm as in the statement of the lemma, for the
graphs isomorphic to $G$, and assume, contrary to the claim in the lemma, that $q < g/2$. We shall say
that $A$ accepts an edge $(u, v)$ in $G$ if it gives a positive answer when queried on $(u, v)$ (that is,
$(u, v)$ belongs to the sparse spanning graph $G'$). We will show that with probability 1 over its
random coins, $A$ accepts every edge of $G$. This will complete the proof as it means that the number
of edges of $G$ that $A$ accepts is $(1 + \frac{1}{2})n$, where $n$ is the number of vertices of $G$ (recall that $G$
is 3-regular), contradicting condition (ii) in Definition 1.

Let $(u, v) \in E(G)$ and assume for contradiction that there is a sequence $r$ of random coins
for $A$ such that the corresponding deterministic algorithm $A_r$ satisfies $A_r(G, u, v) = 0$. Suppose,
without loss of generality, that the vertex set of $G$ is $[n]$ and that $(1, 2)$ is a bridge in $G$. We
will construct a permutation $\sigma$ on $[n]$ with $\sigma(u) = 1$, $\sigma(v) = 2$ so that the graph $\sigma(G)$ (with an
appropriate way of ordering the neighbors of each vertex) has the property that the query-answer
transcript of $A_r(G, u, v)$ is identical to that of $A_r(\sigma(G), u, v)$. Note that $A_r(\sigma(G), u, v)$ is well
defined since the input edge $(u, v)$ is indeed an edge of $\sigma(G)$, and since $\sigma(G)$ is a valid input graph
to $A_r$ being isomorphic to $G$. Since $A_r$ is deterministic, whether or not $A_r$ accepts $(u, v)$ depends
solely on the query-answer transcript. Therefore, the existence of $\sigma$ as above would imply that
$A_r(\sigma(G), u, v) = 0$. However, this would contradict condition (i) in Definition 1 since $(u, v)$ is a
bridge in $\sigma(G)$.

Let $Q = (x_j, i_j, y_j)_{j=1}^{q}$ be the query-answer transcript of $A_r(G, u, v)$. We first claim that if
a permutation $\sigma$ and an ordering of the neighbors of each vertex of $\sigma(G)$ are such that $\sigma(u) = 1$,
$\sigma(v) = 2$ and for every $1 \le j \le q$ the $i_j$-th neighbor of vertex $x_j$ in $\sigma(G)$ is vertex $y_j$, then the
query-answer transcript of $A_r(G, u, v)$ is identical to the query-answer transcript of $A_r(\sigma(G), u, v)$.
To see this, let the query-answer transcript of $A_r(\sigma(G), u, v)$ be denoted by $(x'_j, i'_j, y'_j)_{j=1}^{q'}$. We prove,
by induction on $j$, that the two query-answer transcripts are the same when restricted to the first
$1 \le j \le q$ queries, that is, $(x'_j, i'_j) = (x_j, i_j)$ and $y'_j = y_j$ for every $1 \le j \le q$. Note that this will also
imply that $q = q'$ (i.e., that the number of queries is identical). For $j = 1$ we have $(x'_1, i'_1) = (x_1, i_1)$
since $A_r$ is deterministic and in both cases the input is $(u, v)$. Our assumption on $\sigma$ thus guarantees
that we also have $y'_1 = y_1$. Suppose our claim holds for the first $j - 1$ queries. Again, since $A_r$ is
deterministic, the $j$-th query is determined only by the query-answer transcript of the first $j - 1$
queries (and the input edge). Hence, the induction hypothesis implies that $(x'_j, i'_j) = (x_j, i_j)$ and
our assumption on $\sigma$ again implies that we also have $y'_j = y_j$. This completes the inductive proof.
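
The induction above says that a deterministic algorithm cannot distinguish two inputs that answer its queries identically. The toy sketch below illustrates this with a hypothetical three-query algorithm (`toy_algorithm` is invented here and is not a sparse spanning graph algorithm) and two neighbor orderings of $K_4$ that agree only on the queried entries.

```python
def run(algorithm, nbr, u, v):
    """Run a deterministic algorithm with neighbor-query access to `nbr`.

    nbr[x][i] is the i-th neighbour of x. Returns (output, transcript).
    """
    transcript = []

    def query(x, i):
        y = nbr[x][i]
        transcript.append((x, i, y))
        return y

    return algorithm(query, u, v), transcript


def toy_algorithm(query, u, v):
    # Hypothetical deterministic rule: each query depends only on earlier answers.
    a = query(u, 0)
    b = query(v, 0)
    c = query(a, 1)
    return int(c != b)


# Two orderings of K_4 that agree on entries (0, 0), (1, 0) and (1, 1) only.
g1 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
g2 = {0: [1, 3, 2], 1: [0, 2, 3], 2: [3, 1, 0], 3: [2, 0, 1]}
print(run(toy_algorithm, g1, 0, 1) == run(toy_algorithm, g2, 0, 1))
```

Both runs produce the same transcript and therefore the same output, even though the two inputs differ elsewhere.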

It follows that in order to complete the proof it suffices to find a permutation $\sigma$ and an ordering of
the neighbors of each vertex, as above. Let again $Q = (x_j, i_j, y_j)_{j=1}^{q}$ be the query-answer transcript
of $A_r(G, u, v)$, and let $F$ be the (labeled) graph spanned by the edge set$^5$

\[ E(F) = \{(x_j, y_j)\}_{j=1}^{q} \cup \{(u, v)\} . \]

Since

\[ |E(F)| \le q + 1 \le g/2 , \tag{5} \]

we have that $F$ is a forest. Let $T_1, \ldots, T_k$ be the (labeled) trees in $F$. For the sake of defining
$\sigma$ it will be convenient to consider a single tree $T$. The edge-set of $T$ consists of $E(F)$ and $k - 1$
additional edges. The additional edges do not necessarily belong to $G$, and are selected as follows.
For each labeled tree $T_i$, let $t_i$ denote an arbitrary vertex of degree smaller than 3. For every
$i \in [k - 1]$, add the edge $(t_i, t_{i+1})$.

5 $E(F)$ might contain the edge $(x, y)$ twice if $y$ is the $i$-th neighbor of $x$, $x$ is the $j$-th neighbor of $y$ and the
algorithm queried both $(x, i)$ and $(y, j)$. In this case we will keep just one copy of $(x, y)$ thus making sure that $E(F)$
is indeed a set, and not a multi-set.

Observe that Equation (5) implies that $|E(T)| < g$. Consider a rooted version of $T$ where $u$ is
the root, and construct $\sigma$ as follows. Set $\sigma(u) = 1$, $\sigma(v) = 2$, and define the neighborhood relation
between $u, v$ in $\sigma(G)$ as it is in $G$. That is, if $u$ is the $i$-th neighbor of $v$ and $v$ is the $j$-th neighbor
of $u$ in $G$ then the same holds in $\sigma(G)$. Suppose we have already defined $\sigma(x)$ for all $x$ at distance
at most $d - 1$ from $u$ (in $T$) as well as for some vertices at distance $d$, and let $y$ be a vertex at
distance $d$ for which $\sigma(y)$ has not been defined yet. Let $x$ be the parent of $y$ in $T$ (whose distance
from $u$ is thus $d - 1$) and let us set $\sigma(y)$ to be a neighbor of $\sigma(x)$ in $G$ which is not the image of
any vertex under the $\sigma$ we have defined thus far. Such a vertex exists since $G$ is 3-regular and the
degree in $T$ is at most 3. If the edge $(x, y)$ is in $F$ then we define the neighborhood relation between
$\sigma(x)$ and $\sigma(y)$ as that between $x$ and $y$ in $G$. Once we define $\sigma$ for all vertices of $T$ we arbitrarily
extend $\sigma$ to a permutation, and extend the neighborhood relation between the vertices in a
consistent manner. ■

We are now ready to prove Theorem 3. 

Proof of Theorem 3: Let

\[ h(x) = \frac{1}{\log(3x)} \cdot 32(\log\log(8x))^2 . \]

It is not hard to check that $h : [1/2, \infty) \to \mathbb{R}$ is monotone decreasing. Note that the graph in
Lemma 2 is $h$-non-expanding, since for $x \ge 3$,

\[ h(x) \ge \frac{1}{2\log x} \cdot 32(\log\log x)^2 = \frac{1}{\log x} \cdot (4\log\log x)^2 . \]

Apply Lemma 4 on the graph(s) in Lemma 2. We get a 3-regular connected $n$-vertex graph, for
infinitely many $n$, that contains a bridge, has girth at least

\[ \frac{\log\log((n - 1)/2)}{2000} \ge \frac{\log\log(n/4)}{2000} \ge \frac{\log\log(n)}{4000} \]

and is $f$-non-expanding with

\[ f(x) = 3h(x/2 - 1) \le \frac{1}{\log(x/2)} \cdot 96(\log\log(4x))^2 \le \frac{3}{\log x} \cdot 96(4\log\log x)^2 \le \frac{1}{\log x} \cdot (70\log\log x)^2 , \]

where we assumed $x \ge 3$. The proof now follows immediately from Lemma 5. ■
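
The two chains of inequalities in this proof can be sanity-checked numerically. The sketch below assumes base-2 logarithms and takes $h(x) = 32(\log\log(8x))^2/\log(3x)$. The function names and the sampling grid are choices of this illustration. It evaluates both ends of each chain on a geometric grid of values $x \ge 3$. Such a check is of course not a proof.

```python
import math


def h(x):
    """h(x) = 32 (log log 8x)^2 / log(3x), base-2 logarithms assumed."""
    return 32 * math.log2(math.log2(8 * x)) ** 2 / math.log2(3 * x)


def f_lemma2(x):
    """The bound of Lemma 2: (4 log log x)^2 / log x."""
    return (4 * math.log2(math.log2(x))) ** 2 / math.log2(x)


def f_final(x):
    """The final bound: (70 log log x)^2 / log x."""
    return (70 * math.log2(math.log2(x))) ** 2 / math.log2(x)


# Geometric grid from x = 3 up to roughly 6.5 * 10^6.
xs = [3 * 1.05 ** i for i in range(300)]
h_dominates = all(h(x) >= f_lemma2(x) for x in xs)
final_bound_holds = all(3 * h(x / 2 - 1) <= f_final(x) for x in xs)
print(h_dominates, final_bound_holds, 3 * 96 * 4 ** 2 <= 70 ** 2)
```

The last printed value reflects the constant arithmetic behind the final step: $3 \cdot 96 \cdot 4^2 = 4608 \le 4900 = 70^2$.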